
    Measuring Software Performance on Linux

    Measuring and analyzing the performance of software has become highly complex, caused by more advanced processor designs and the intricate interaction between user programs, the operating system, and the processor's microarchitecture. In this report, we summarize our experience of how performance characteristics of software should be measured when running on a Linux operating system and a modern processor. In particular, (1) we provide a general overview of hardware and operating system features that may have a significant impact on timing and how they interact, (2) we identify sources of errors that need to be controlled in order to obtain unbiased measurement results, and (3) we propose a measurement setup for Linux that minimizes these errors. Although not the focus of this report, we also describe the measurement process using hardware performance counters, which can faithfully reflect the real bottlenecks on a given processor. Our experiments confirm that the measurement setup has a large impact on the results. More surprisingly, however, they also suggest that its impact can be negligible for certain analysis methods. Furthermore, we found that our setup maintains significantly better performance under background load conditions, which means it can be used to improve software in high-performance applications.
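
    As a minimal illustration of such a setup (a sketch under assumed conventions, not the report's exact procedure), the snippet below pins the measuring process to one core, performs warm-up runs, and reports robust statistics over repeated timings; the core id and the workload are placeholders.

        # Minimal sketch of a controlled timing measurement on Linux (assumed setup).
        import os
        import statistics
        import time

        def measure(workload, repeats=30, warmup=5, cpu=2):
            os.sched_setaffinity(0, {cpu})      # reduce migration noise (placeholder core id)
            for _ in range(warmup):             # warm caches and branch predictors
                workload()
            samples = []
            for _ in range(repeats):
                t0 = time.perf_counter_ns()
                workload()
                samples.append(time.perf_counter_ns() - t0)
            return statistics.median(samples), statistics.stdev(samples)

        if __name__ == "__main__":
            med, dev = measure(lambda: sum(i * i for i in range(100_000)))
            print(f"median {med} ns, stddev {dev:.0f} ns")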

    A Valgrind Tool to Compute the Working Set of a Software Process

    This paper introduces a new open-source tool for the dynamic analyzer Valgrind. The tool measures the amount of memory that is actively being used by a process at any given point in time. While numerous tools exist to measure the memory requirements of a process, the vast majority focus only on metrics like resident or proportional set sizes, which include memory that was once claimed but is currently unused. Consequently, such tools do not permit drawing conclusions about how much cache or RAM a process actually requires at each point in time, and thus cannot be used for performance debugging. The few tools which do measure only actively used memory, however, have limitations in temporal resolution and introspection. In contrast, our tool offers an easy way to compute the memory that has recently been accessed at any point in time, reflecting how cache and RAM requirements change over time. In particular, this tool computes the set of memory references made within a fixed time interval before any point in time, known as the working set, and captures call stacks for interesting peaks in the working set size. We first introduce the tool, then run some examples comparing its output with that of similar memory tools, and close with a discussion of limitations. Comment: 8 pages.
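
    The working-set notion itself can be illustrated independently of Valgrind; the sketch below is a simplification with an assumed 64-byte block granularity and a synthetic access trace, counting the distinct blocks touched within a fixed look-back window at each access.

        # Illustrative working-set computation over a synthetic trace (not the tool itself).
        from collections import deque

        BLOCK = 64  # assumed cache-line granularity [bytes]

        def working_set_sizes(trace, window):
            """trace: iterable of (time, address); window: look-back interval."""
            recent = deque()            # (time, block) pairs still inside the window
            sizes = []
            for t, addr in trace:
                recent.append((t, addr // BLOCK))
                while recent and recent[0][0] < t - window:
                    recent.popleft()
                sizes.append((t, len({b for _, b in recent})))
            return sizes

        # Example: a loop repeatedly touching 4 KiB yields a stable working set
        trace = [(i, i % 4096) for i in range(10_000)]
        print(working_set_sizes(trace, window=1_000)[-1])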

    Thermodynamics of FRW Universe With Chaplygin Gas Models

    In this paper we have examined the validity of the generalized second law of thermodynamics (GSLT) in an expanding Friedmann-Robertson-Walker (FRW) universe filled with different variants of Chaplygin gases. Assuming that the universe is a closed system bounded by the cosmological horizon, we first present the general prescription for the rate of change of total entropy on the boundary. In the subsequent part we have analyzed the validity of the generalized second law of thermodynamics on the cosmological apparent horizon and the cosmological event horizon for different Chaplygin gas models of the universe. The analysis is supported with the help of suitable graphs to clarify the status of the GSLT on the cosmological horizons. In the case of the cosmological apparent horizon we have found that some of these models always obey the GSLT, whereas the validity of the GSLT on the cosmological event horizon of all these models depends on the choice of free parameters in the respective models. Comment: 20 pages, 19 figures; final version published online in General Relativity and Gravitation.
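
    For orientation, the standard ingredients of such an analysis (generic conventions assumed here, not the paper's exact expressions) are the generalized Chaplygin gas equation of state and the GSLT condition on a horizon of radius R_h:

        % Sketch with assumed standard conventions (c = \hbar = k_B = 1).
        \begin{align}
          p &= -\frac{A}{\rho^{\alpha}}, \qquad 0 < \alpha \le 1
              \quad \text{(generalized Chaplygin gas)},\\
          S_h &= \frac{\pi R_h^{2}}{G}, \qquad T_h = \frac{1}{2\pi R_h},\\
          \dot{S}_{\mathrm{tot}} &= \dot{S}_h + \dot{S}_{\mathrm{fluid}} \;\ge\; 0
              \quad \text{(GSLT)}.
        \end{align}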

    How Reliable is Smartphone-based Electronic Contact Tracing for COVID-19?

    Smartphone-based electronic contact tracing is currently considered an essential tool towards easing lockdowns, curfews, and shelter-in-place orders issued by most governments around the world in response to the 2020 novel coronavirus (SARS-CoV-2) crisis. While the focus in developing smartphone-based contact tracing applications (apps) has been on the privacy concerns stemming from their use, an important question that has not received sufficient attention is: how reliable will such smartphone-based electronic contact tracing be? This is a technical question related to how reliably two smartphones register their mutual proximity. Here, we examine in detail the technical prerequisites for effective smartphone-based contact tracing. The underlying mechanism that any contact tracing app relies on is called Neighbor Discovery (ND), in which smartphones transmit and scan for Bluetooth signals to record their mutual presence whenever they are in close proximity. The hardware support and the software protocols used for ND in smartphones, however, were not designed for reliable contact tracing. In this paper, we quantitatively evaluate how reliably smartphones can do contact tracing. Our results point towards the design of a wearable solution for contact tracing that can overcome the shortcomings of a smartphone-based solution and provide more reliable and accurate contact tracing. To the best of our knowledge, this is the first study that quantifies both the suitability and the drawbacks of smartphone-based contact tracing. Further, our results can be used to parameterize an ND protocol to maximize the reliability of any contact tracing app that uses it.
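
    A rough back-of-envelope (not the paper's measurements) hints at why parameterization matters: if each advertising event were assumed to land in a scan window independently with probability d/Ts — a simplification that ignores the periodic coupling analyzed in the neighbor-discovery work below, and all values are placeholders — the chance of registering an encounter of duration D would be:

        # Hypothetical back-of-envelope; independence assumption and parameters are illustrative only.
        def registration_probability(D, Ta, Ts, d):
            n_beacons = int(D / Ta)            # advertising events during the encounter
            p_hit = min(1.0, d / Ts)           # assumed per-beacon chance of meeting a scan window
            return 1 - (1 - p_hit) ** n_beacons

        # 15-second encounter, 1 s advertising interval, 5.12 s scan interval, 0.512 s scan window
        print(f"{registration_probability(15, 1.0, 5.12, 0.512):.2f}")   # ~0.79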

    Slotless Protocols for Fast and Energy-Efficient Neighbor Discovery

    In mobile ad-hoc networks, neighbor discovery protocols are used to find surrounding devices and to establish a first contact between them. Since the clocks of the devices are not synchronized and their energy budgets are limited, duty-cycled, asynchronous discovery protocols are usually applied. Only if two devices are awake at the same point in time can they rendezvous. Currently, time-slotted protocols, which subdivide time into multiple intervals of equal length (slots), are considered to be the most efficient discovery schemes. In this paper, we break away from the assumption of slotted time. We propose a novel, continuous-time discovery protocol which temporally decouples beaconing and listening. Each device periodically sends packets with a certain interval, and periodically listens for a given duration with a different interval. By optimizing these interval lengths, we show that this scheme can, to the best of our knowledge, significantly outperform all known protocols such as DISCO, U-Connect, or Searchlight. For example, Searchlight takes up to 740% longer than our proposed technique to discover a device with the same duty-cycle. Further, our proposed technique can also be applied in widely used, purely interval-based asymmetric protocols such as ANT or Bluetooth Low Energy.
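
    The scheme can be illustrated with a small Monte-Carlo sketch (a simplification, not the paper's optimization: beacons are treated as instantaneous and all parameter values are placeholders): device A beacons every Ta seconds, device B scans for a window of d seconds every Ts seconds, both with random initial phases.

        # Monte-Carlo estimate of the discovery latency for one illustrative parametrization.
        import random

        def discovery_latency(Ta, Ts, d, horizon=100.0):
            a0 = random.uniform(0, Ta)          # phase of the first beacon
            s0 = random.uniform(0, Ts)          # phase of the first scan window
            t = a0
            while t < horizon:
                if (t - s0) % Ts <= d:          # beacon falls inside a scan window
                    return t
                t += Ta
            return None                         # not discovered within the horizon

        random.seed(1)
        lats = [discovery_latency(Ta=1.0, Ts=3.17, d=0.32) for _ in range(10_000)]
        hits = [x for x in lats if x is not None]
        print(f"mean latency {sum(hits) / len(hits):.2f} s ({len(hits)} of {len(lats)} discovered)")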

    Highway traffic data: macroscopic, microscopic and criticality analysis for capturing relevant traffic scenarios and traffic modeling based on the highD data set

    This work provides a comprehensive analysis of naturalistic driving behavior on highways based on the highD data set. Two thematic fields are considered. First, some macroscopic and microscopic traffic statistics are provided. These include the traffic flow rate and the traffic density, as well as velocity, acceleration, and distance distributions. Additionally, their mutual dependencies are examined and compared to related work. The second part investigates the distributions of criticality measures. The Time-To-Collision, the Time-Headway, and a third measure, which couples both, are analyzed. These measures are also combined with other indicators. Scenarios in which these measures reach a critical level are discussed separately. The results are compared to related work as well. The two main contributions of this work can be stated as follows. First, the analysis of the criticality measures can be used to find suitable thresholds for rare traffic scenarios. Second, the statistics provided in this work can also be utilized for traffic modeling, for example in simulation environments.
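
    For reference, the two measures can be computed from a single leader-follower pair as sketched below (assumed conventions: gap and speeds in SI units, evaluated per frame of a trajectory data set; the ~1 s criticality threshold in the comment is a common rule of thumb, not a value from this work).

        def time_headway(gap_m, follower_speed_mps):
            """THW: time the follower needs to cover the current gap."""
            return gap_m / follower_speed_mps if follower_speed_mps > 0 else float("inf")

        def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
            """TTC: time until the gap closes, defined only while closing in."""
            closing = follower_speed_mps - leader_speed_mps
            return gap_m / closing if closing > 0 else float("inf")

        # Example: 20 m gap, follower at 30 m/s, leader at 25 m/s
        print(f"THW: {time_headway(20, 30):.2f} s")           # 0.67 s, often flagged as critical below ~1 s
        print(f"TTC: {time_to_collision(20, 30, 25):.2f} s")  # 4.00 s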

    Cost-effective Energy Monitoring of a Zynq-based Real-time System including dual Gigabit Ethernet

    The ongoing integration of fine-grained power management features already established in CPU-driven Systems-on-Chip (SoCs) enables both traditional Field Programmable Gate Arrays (FPGAs) and, more recently, hybrid Programmable SoCs (pSoCs) to reach more energy-sensitive application domains (e.g., automotive and robotics). By combining a fixed-function multi-core SoC with flexible, configurable FPGA fabric, the latter can be used to realize heterogeneous Real-time Systems (RTSs), commonly implementing complex application-specific architectures with high computation and communication (I/O) densities. Their dynamic changes in workload, in the currently active power-saving features, and thus in power consumption require precise voltage and current sensing on all relevant supply rails to enable dependable evaluation of the various power management techniques. In this paper, we propose a low-cost 18-channel, 16-bit-resolution measurement (sub-)system capable of 200 kSPS (kilo-samples per second) for instrumentation of current pSoC development boards. To this end, we combine simultaneously sampling analog-to-digital converters (ADCs) and analog voltage/current sensing circuitry with a Cortex-M7 microcontroller that uses an SD card for storage. In addition, we propose to include crucial I/O components such as Ethernet PHYs in the power monitoring to gain a holistic view of the RTS's temporal behavior, covering not only computation on FPGA and CPUs, but also communication in terms of, e.g., reception of sensor values and transmission of actuation signals. We present an FMC-sized implementation of our measurement system combined with two Gigabit Ethernet PHYs and one HDMI input. Paired with Xilinx's ZC702 development board, we are able to synchronously acquire power traces of a Zynq pSoC and the two PHYs precise enough to identify individual Ethernet frames. Comment: 4 pages, 4 figures.
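
    As a rough illustration of the data such a logger produces (not the authors' firmware; reference voltage, shunt value, amplifier gain, and per-channel sample rate are assumed placeholders), the sketch below converts raw ADC codes to rail power and estimates the raw stream the storage path would have to sustain if all 18 channels ran at the full 200 kSPS.

        ADC_BITS   = 16
        VREF       = 4.096       # assumed ADC reference voltage [V]
        SHUNT_OHMS = 0.01        # assumed current-sense shunt [ohm]
        GAIN       = 50          # assumed current-sense amplifier gain

        def rail_power(v_code, i_code):
            v = v_code / (2**ADC_BITS - 1) * VREF                          # rail voltage [V]
            i = i_code / (2**ADC_BITS - 1) * VREF / (GAIN * SHUNT_OHMS)    # rail current [A]
            return v * i                                                   # instantaneous power [W]

        channels, sps, bits = 18, 200_000, 16
        print(f"raw stream: {channels * sps * bits / 8 / 1e6:.1f} MB/s")   # 7.2 MB/s
        print(f"example rail: {rail_power(29_000, 12_000) * 1000:.0f} mW")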

    Precise Energy Modeling for the Bluetooth Low Energy Protocol

    Bluetooth Low Energy (BLE) is a wireless protocol well suited for ultra-low-power sensors running on small batteries. BLE is described as a new protocol in the official Bluetooth 4.0 specification. To design energy-efficient devices, the protocol provides a number of parameters that need to be optimized within an energy, latency, and throughput design space. To minimize power consumption, these protocol parameters have to be optimized for a given application. Therefore, an energy model that can predict the energy consumption of a BLE-based wireless device for different parameter settings is needed. As BLE differs significantly from classic Bluetooth, models for Bluetooth cannot easily be applied to the BLE protocol. Over the last year, a couple of energy models for BLE have been proposed; however, none of them can model all the operating modes of the protocol. This paper presents a precise energy model of the BLE protocol that allows the computation of a device's power consumption in all possible operating modes. To the best of our knowledge, our proposed model is not only one of the most accurate ones known so far (because it accounts for all protocol parameters), but it is also the only one that models all the operating modes of BLE. Furthermore, we present a sensitivity analysis of the impact of the different parameters on the energy consumption and evaluate the accuracy of the model using both discrete-event simulation and actual measurements. Based on this model, guidelines for system designers are presented that help in choosing the right parameters to optimize the energy consumption for a given application.
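
    The flavor of such a model can be conveyed by a much-simplified sketch (not the paper's model; phase durations, currents, and the sleep current are assumed placeholders): the average current is the charge of one connection event plus the sleep charge, divided by the connection interval.

        PHASES = [                 # (name, duration [s], current [A]) of one connection event
            ("wakeup", 400e-6,  6e-3),
            ("rx",     200e-6, 15e-3),
            ("tx",     150e-6, 18e-3),
            ("post",   300e-6,  7e-3),
        ]
        I_SLEEP = 1e-6             # sleep current between events [A]

        def average_current(conn_interval_s):
            active_charge = sum(d * i for _, d, i in PHASES)     # coulombs per event
            active_time   = sum(d for _, d, _ in PHASES)
            sleep_charge  = (conn_interval_s - active_time) * I_SLEEP
            return (active_charge + sleep_charge) / conn_interval_s

        for ci in (0.1, 0.5, 1.0):                               # connection intervals [s]
            print(f"{ci:4.1f} s interval -> {average_current(ci) * 1e6:6.1f} uA average")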

    Neighbor discovery latency in BLE-like duty-cycled protocols

    Neighbor discovery is the procedure by which two wireless devices initiate a first contact. In low-power ad-hoc networks, radios are duty-cycled, and the latency until a packet meets a reception phase of another device is determined by a random process. Most research considers slotted protocols, in which the points in time for reception are temporally coupled to beacon transmissions. In contrast, many recent protocols, such as ANT/ANT+ and Bluetooth Low Energy (BLE), use a slotless, periodic-interval-based scheme for neighbor discovery. Here, one device periodically broadcasts packets, whereas the other device periodically listens to the channel. Both periods are independent of each other and drawn over continuous time. Such protocols provide three degrees of freedom (viz., the advertising interval, the scanning interval, and the duration of each scan phase). Though billions of existing BLE devices rely on these protocols, neither their expected latencies nor beneficial configurations with good latency-duty-cycle relations are known. Parametrizations for the participating devices are usually determined based on a "good guess". In this paper, we present for the first time a mathematical theory which can compute the neighbor discovery latencies for all possible parametrizations. Further, our theory shows that upper bounds on the latency can be guaranteed for all parametrizations, except for a finite number of singularities. Therefore, slotless, periodic interval-based protocols can be used in applications with deterministic latency demands, which have been reserved for slotted protocols until now. Our proposed theory can be used for analyzing neighbor discovery latencies, for tweaking protocol parameters, and for developing new protocols.
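
    Where the paper derives these latencies in closed form, the bounded-latency claim can at least be probed numerically; the sketch below (a brute-force simplification with point-sized beacons, placeholder parameters, and a finite offset grid) sweeps the initial phase offset between advertiser and scanner for one parametrization and reports the largest observed discovery latency.

        def latency(offset, Ta, Ts, d, horizon=300.0):
            """Latency until a beacon (sent every Ta) meets a scan window (d long, every Ts)."""
            t = 0.0
            while t < horizon:
                if (t + offset) % Ts <= d:      # beacon falls inside a scan window
                    return t
                t += Ta
            return float("inf")                 # echoes the "singularity" cases

        Ta, Ts, d = 1.0, 3.17, 0.32             # placeholder advertising/scan parameters [s]
        worst = max(latency(k * Ts / 1000, Ta, Ts, d) for k in range(1000))
        print(f"worst-case latency over swept offsets: {worst:.2f} s")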

    Density perturbation and cosmological evolution in the presence of magnetic field in f(R) gravity models

    In this paper, we have investigated the density perturbations and cosmological evolution in the FLRW universe in the presence of a cosmic magnetic field, which may be assumed to mimic primordial magnetic fields. Such magnetic fields have sufficient strength to influence galaxy formation and cluster dynamics, thereby leaving an imprint on the CMB anisotropies. We have considered the FLRW universe as a representative of the isotropic cosmological model in the 1+3 covariant formalism for f(R) gravity. The propagation equations have been determined and analyzed, where we have assumed that the magnetic field is aligned uniformly along the x-direction, resulting in a diagonal shear tensor. Subsequently, the density perturbation evolution equations have been studied and the results interpreted. We have also indicated how these results change in the general relativistic case and briefly mentioned the expected change in higher-order gravity theories. Comment: 11 pages.
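
    For context, the standard starting point of such an analysis (generic notation assumed here, not the paper's exact equations; the magnetic energy density is written in Heaviside-Lorentz units) is the f(R) action together with a uniform field along the x-direction:

        % Sketch with assumed standard conventions.
        \begin{align}
          S &= \frac{1}{2\kappa}\int d^{4}x\,\sqrt{-g}\, f(R) + S_{\mathrm{matter}},\\
          \vec{B} &= B\,\hat{x}, \qquad \rho_{B} = \frac{B^{2}}{2},
          \qquad \sigma_{\mu\nu}\ \text{diagonal}.
        \end{align}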